Perspectives of approximate dynamic programming

Author

  • Warren B. Powell
Abstract

Approximate dynamic programming has evolved, initially independently, within operations research, computer science and the engineering controls community, all searching for practical tools for solving sequential stochastic optimization problems. More so than other communities, operations research continued to develop the theory behind the basic model introduced by Bellman with discrete states and actions, even while authors as early as Bellman himself recognized its limits due to the “curse of dimensionality” inherent in discrete state spaces. In response to these limitations, subcommunities in computer science, control theory and operations research have developed a variety of methods for solving different classes of stochastic, dynamic optimization problems, creating the appearance of a jungle of competing approaches. In this article, we show that there is actually a common theme to these strategies, and underpinning the entire field remains the fundamental algorithmic strategies of value and policy iteration that were first introduced in the 1950’s and 60’s. Dynamic programming involves making decisions over time, under uncertainty. These problems arise in a wide range of applications, spanning business, science, engineering, economics, medicine and health, and operations. While tremendous successes have been achieved in specific problem settings, we lack general purpose tools with the broad applicability enjoyed by algorithmic strategies such as linear, nonlinear and integer programming. This paper provides an introduction to the challenges of dynamic programming, and describes the contributions made by different subcommunities, with special emphasis on computer science which pioneered a field known as reinforcement learning, and the operations research community which has made contributions through several subcommunities, including stochastic programming, simulation optimization and approximate dynamic programming. Our presentation recognizes, but does not do justice to, the important contributions made in the engineering controls communities.

W.B. Powell, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA. e-mail: [email protected]
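The value and policy iteration strategies the abstract points to operate on Bellman's equation for a discrete-state, discrete-action Markov decision process, V(s) = max_a { C(s,a) + γ Σ_{s'} P(s'|s,a) V(s') }. As a minimal illustrative sketch (not code from the paper), value iteration for a tiny tabular problem can be written as follows; the transition array P, reward array R and discount factor gamma are hypothetical inputs chosen only so the example runs.

    # Value iteration for a small discrete MDP (illustrative sketch only).
    # P[a, s, s2] is the probability of moving from state s to s2 under action a;
    # R[a, s] is the expected one-step contribution of taking action a in state s.
    import numpy as np

    def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iters=1000):
        A, S, _ = P.shape
        V = np.zeros(S)
        for _ in range(max_iters):
            # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_s2 P[a, s, s2] * V[s2]
            Q = R + gamma * (P @ V)
            V_new = Q.max(axis=0)
            if np.max(np.abs(V_new - V)) < tol:
                V = V_new
                break
            V = V_new
        policy = Q.argmax(axis=0)  # greedy policy with respect to the final value estimate
        return V, policy

    # Hypothetical example: 2 actions, 3 states (state 2 is absorbing).
    P = np.array([[[0.8, 0.2, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]],
                  [[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.0, 0.0, 1.0]]])
    R = np.array([[1.0, 0.5, 0.0],
                  [0.2, 2.0, 0.0]])
    V, policy = value_iteration(P, R)
    print(V, policy)

Enumerating the value table V in this way is exactly what breaks down for large state spaces; that curse of dimensionality is what motivates the approximation strategies surveyed in the paper.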


Similar articles

Approximate Incremental Dynamic Analysis Using Reduction of Ground Motion Records

Incremental dynamic analysis (IDA) requires the analysis of the non-linear response history of a structure for an ensemble of ground motions, each scaled to multiple levels of intensity and selected to cover the entire range of structural response. Recognizing that IDA of practical structures is computationally demanding, an approximate procedure based on the reduction of the number of ground m...


OPTIMIZATION OF A PRODUCTION LOT SIZING PROBLEM WITH QUANTITY DISCOUNT

The dynamic lot sizing problem is one of the significant problems in industrial units and has been considered by many researchers. Quantity discounts in purchasing cost are an important and practical assumption in inventory control models, yet they have received less attention in the stochastic version of the dynamic lot sizing problem. In this paper, stochastic dyn...


Expected Duration of Dynamic Markov PERT Networks

Abstract: In this paper, we apply stochastic dynamic programming to approximate the mean project completion time in dynamic Markov PERT networks. It is assumed that the activity durations are independent random variables with exponential distributions, but that certain social and economic problems influence the mean activity durations. It is also assumed that the social problems evolve in ac...


An Introduction to Adaptive Critic Control: A Paradigm Based on Approximate Dynamic Programming

Adaptive critic control is an advanced control technology developed for nonlinear dynamical systems in recent years. It is based on the idea of approximate dynamic programming. Dynamic programming was introduced by Bellman in the 1950’s for solving optimal control problems of nonlinear dynamical systems. Due to its high computational complexity, applications of dynamic programming have been lim...


Robust Design of Dynamic Cell Formation Problem Considering the Workers Interest

To enhance agility and quick response to customers' demand, manufacturing processes are rearranged according to different systems. The efficient execution of a manufacturing system depends on various factors; among them, cell design and human issues are pivotal. The proposed model designs cellular manufacturing systems using three objective functions from three different perspectives,...


On Sequential Optimality Conditions without Constraint Qualifications for Nonlinear Programming with Nonsmooth Convex Objective Functions

Sequential optimality conditions provide adequate theoretical tools to justify stopping criteria for nonlinear programming solvers. Here, nonsmooth approximate gradient projection and complementary approximate Karush-Kuhn-Tucker conditions are presented. These sequential optimality conditions are satisfied by local minimizers of optimization problems independently of the fulfillment of constrai...



Journal:
  • Annals OR

Volume 241, Issue -

Pages -

Publication date: 2016